Preparations

Load the necessary libraries

library(car)       #for regression diagnostics
library(broom)     #for tidy output
library(ggfortify) #for model diagnostics
library(sjPlot)    #for outputs
library(knitr)     #for kable
library(effects)   #for partial effects plots
library(emmeans)   #for estimating marginal means
library(ggeffects) #for partial effects plots
library(MASS)      #for glm.nb
library(MuMIn)     #for AICc
library(DHARMa)    #for residual diagnostics plots
library(performance) #for residuals diagnostics
library(see)         #for plotting residuals
library(modelr)    #for auxiliary modelling functions
library(tidyverse) #for data wrangling

Scenario

Here is a modified example from Peake and Quinn (1993), who investigated the relationship between the number of invertebrate individuals living among clumps of mussels on a rocky intertidal shore and the area of those mussel clumps.

Format of the peakquinn.csv data file

AREA      INDIV
516.00    18
469.06    60
462.25    57
938.60    100
1357.15   48
...       ...

AREA    Area of mussel clump (mm²) - Predictor variable
INDIV   Number of individuals found within clump - Response variable

The aim of the analysis is to investigate the relationship between mussel clump area and the number of non-mussel invertebrate individuals supported in the mussel clump.

Read in the data

peake = read_csv('../public/data/peakquinn.csv', trim_ws=TRUE)
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   AREA = col_double(),
##   INDIV = col_double()
## )
glimpse(peake)
## Rows: 25
## Columns: 2
## $ AREA  <dbl> 516.00, 469.06, 462.25, 938.60, 1357.15, 1773.66, 1686.01, 1786…
## $ INDIV <dbl> 18, 60, 57, 100, 48, 118, 148, 214, 225, 283, 380, 278, 338, 27…

Exploratory data analysis

Model formula: \[ y_i \sim{} \mathcal{Pois}(\lambda_i)\\ \ln(\lambda_i) = \beta_0 + \beta_1 \ln(x_i) \] where the number of individuals in the \(i^{th}\) observation (\(y_i\)) is assumed to be drawn from a Poisson distribution with a mean of \(\lambda_i\). The natural log of these expected values is modelled against a linear predictor that includes an intercept (\(\beta_0\)) and a slope (\(\beta_1\)) for natural log transformed area (\(x_i\)).
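
To make the log link in the model formula concrete, the following minimal sketch shows that a straight line on the log scale corresponds to a power relationship on the scale of the counts. The coefficient values b0 and b1 are hypothetical, chosen only for illustration; they are not estimates from the Peake and Quinn data.

```r
# Hypothetical coefficients, for illustration only (not fitted values)
b0 <- -1.0   # intercept on the log (link) scale
b1 <- 0.8    # slope for ln(area)

# The five clump areas (mm^2) shown in the data format table
area <- c(516.00, 469.06, 462.25, 938.60, 1357.15)

# Linear predictor on the link (log) scale
eta <- b0 + b1 * log(area)

# Expected counts: back-transform with the inverse link (exp)
lambda <- exp(eta)

# Equivalent power form: lambda = exp(b0) * area^b1
lambda_power <- exp(b0) * area^b1

# The two forms agree up to floating point error
all.equal(lambda, lambda_power)
```

On the response scale, the slope therefore acts as an exponent on area rather than as an additive change in the expected count.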

Fit the model
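
A minimal sketch of fitting the Poisson model with a log link, assuming base R's glm(). Because the full peakquinn.csv file is not reproduced here, the sketch uses only the five rows shown in the data format table, so the estimates are illustrative only. The object names peake_head and peake_glm are hypothetical.

```r
# First five rows shown in the data format table (NOT the full data set)
peake_head <- data.frame(
  AREA  = c(516.00, 469.06, 462.25, 938.60, 1357.15),
  INDIV = c(18, 60, 57, 100, 48)
)

# Poisson GLM with a log link and natural log transformed area
peake_glm <- glm(INDIV ~ log(AREA), data = peake_head,
                 family = poisson(link = "log"))

# Coefficients are reported on the log (link) scale
coef(peake_glm)

# Expected counts on the response scale
fitted(peake_glm)
```

With the full data set read in earlier, the same call would use data = peake in place of the hypothetical peake_head.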

Partial plots

Model investigation / hypothesis testing

Predictions

Summary figures

References

Peake, A. J., and G. P. Quinn. 1993. “Temporal Variation in Species-Area Curves for Invertebrates in Clumps of an Intertidal Mussel.” Ecography 16: 269–77.